Biomedical Engineering
2025-07-16
Suppose…
A dataset of M tuples \((\mathbf{x}_i, \mathbf{y}_i)\) with i = 1, …, M.
What is a neural network?
A neural network is a mathematical function (sometimes called a network function) that takes some kind of input (typically multi-dimensional), called x, and generates some output.
Network function
Important
A neural network is nothing more than a mathematical function that depends on a set of parameters that are tuned, hopefully in some smart way, to make the network output as close as possible to some expected output.
Learning
\(\min_{\mathbf{\theta}_k \in \mathbb{R}^N} L \left( f \left( \mathbf{\theta}_k, \mathbf{x}_i \right), \mathbf{y}_i \right)\) subject to \(c_q, q=1,2,3,\ldots,Q\) with \(Q \in \mathbb{N}\)
The learning process is the search for a minimum. However, most algorithms can find only a "local" minimum.
In principle, we want to find the global minimum or, in other words, the point at which the function value is the smallest among all possible points.
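The search for a minimum described above can be sketched with plain gradient descent. This is a minimal illustration, not the course's algorithm: the one-dimensional loss \((\theta - 3)^2\), the starting point, the learning rate, and the step count are all made-up choices.

```python
# Minimal gradient-descent sketch: minimize L(theta) = (theta - 3)^2,
# whose global minimum sits at theta = 3.

def grad(theta):
    # dL/dtheta for L(theta) = (theta - 3)^2
    return 2.0 * (theta - 3.0)

theta = 0.0           # illustrative initial guess
lr = 0.1              # illustrative learning rate
for _ in range(200):  # fixed number of update steps
    theta -= lr * grad(theta)
# theta is now very close to the minimizer, 3.0
```

For this convex loss the local minimum found by the iteration is also the global one; for a real neural network loss the same update rule typically stops at one of many local minima.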
Taken from GeeksforGeeks
Neural networks have a great number of internal parameters for learning, which vary over a vast range of values.
This number of parameters is fundamental to the neural network's knowledge representation.
Problem
But if this number grows too large, the neural network becomes prone to overfitting.
Definition
Regularization techniques reduce the possibility of a neural network overfitting by constraining the range of values that the weights within the network can hold.
\[\begin{eqnarray} L(\mathbf{x},\mathbf{y}) = \sum_{k=1}^N \left( y_k - f \left( x_k \right) \right)^2 \\ f \left( x \right) = \theta_0+\theta_1 x + \theta_2 x^2 + \theta_3 x^3 + \theta_4 x^4 \\ f \left( x \right) = \theta_0+\theta_1 x + \theta_2 x^2 \end{eqnarray}\]
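The two model families above (degree-4 and degree-2 polynomials) can be fitted by least squares with `numpy.polyfit`, which minimizes exactly the squared-error loss \(L\). The data points here are made up for illustration: the targets follow an exact quadratic, so both fits recover it and both residual sums are essentially zero.

```python
# Sketch: fit the degree-4 and degree-2 models from the equations above.
import numpy as np

x = np.linspace(-1.0, 1.0, 20)
y = 1.0 + 2.0 * x + 0.5 * x**2        # illustrative, exactly quadratic targets

theta4 = np.polyfit(x, y, 4)          # degree-4 model (more parameters)
theta2 = np.polyfit(x, y, 2)          # degree-2 model

# Sum-of-squares loss L(x, y) for each fitted model
loss4 = np.sum((y - np.polyval(theta4, x))**2)
loss2 = np.sum((y - np.polyval(theta2, x))**2)
```

With noisy data the degree-4 model would still drive its training loss at least as low as the degree-2 model, which is precisely the overfitting risk that motivates the penalty term introduced next.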
\[\begin{eqnarray} L(\mathbf{x},\mathbf{y}) = \sum_{k=1}^N \left( y_k - f \left( x_k \right) \right)^2 + \lambda \sum_{k=1}^N \theta_k \end{eqnarray}\]
Note
\(\lambda\) is the penalty term, or regularization parameter, which determines how much to penalize the weights.
L1 Regularization or Lasso or L1 norm
L1 penalizes the sum of the absolute values of the weights.
L1 has a sparse solution.
L1 can have multiple solutions.
L1 has built-in feature selection.
L1 is robust to outliers.
L1 generates models that are simple and interpretable but cannot learn complex patterns.
\[\begin{eqnarray} L(\mathbf{x},\mathbf{y}) = \sum_{k=1}^N \left( y_k - f \left( x_k \right) \right)^2 + \lambda \sum_{k=1}^N \lvert \theta_k \rvert \end{eqnarray}\]
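The L1-penalized loss above can be written out directly. Everything here is illustrative: the weight vector, the data, and \(\lambda = 0.1\) are made-up values chosen so the residual term vanishes and only the penalty remains.

```python
# Sketch of the L1-regularized loss: squared error + lambda * sum |theta_k|.
import numpy as np

def l1_loss(theta, x, y, lam):
    # theta[k] multiplies x**k, so reverse it for np.polyval
    pred = np.polyval(theta[::-1], x)
    return np.sum((y - pred)**2) + lam * np.sum(np.abs(theta))

theta = np.array([1.0, -2.0, 0.5])      # illustrative weights
x = np.array([0.0, 1.0, 2.0])
y = np.polyval(theta[::-1], x)          # targets with zero residual
loss = l1_loss(theta, x, y, lam=0.1)    # only the penalty remains: ~0.35
```

Because the penalty grows linearly with each \(\lvert\theta_k\rvert\), minimizing this loss tends to push small weights exactly to zero, which is the sparsity and built-in feature selection mentioned above.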
L2 Regularization or Ridge Regularization
L2 regularization penalizes the sum of squared weights.
L2 has a non-sparse solution.
L2 has a unique solution.
L2 has no built-in feature selection.
L2 is not robust to outliers.
L2 gives better predictions when the output variable is a function of all input features.
L2 regularization is able to learn complex data patterns.
\[\begin{eqnarray} L(\mathbf{x},\mathbf{y}) = \sum_{k=1}^N \left( y_k - f \left( x_k \right) \right)^2 + \lambda \sum_{k=1}^N \theta_k^2 \end{eqnarray}\]
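For a linear model, the L2-penalized loss above has the classical ridge-regression closed form \(\boldsymbol{\theta} = (X^\top X + \lambda I)^{-1} X^\top \mathbf{y}\), which also makes the "unique solution" property concrete. The data and \(\lambda\) below are made up for illustration.

```python
# Ridge sketch: closed-form L2-regularized solution for a linear model.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))            # illustrative design matrix
true_theta = np.array([1.0, -1.0, 0.5])
y = X @ true_theta                      # noise-free targets for simplicity

lam = 1.0
# theta = (X^T X + lambda I)^(-1) X^T y
theta = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)

# Unregularized least-squares solution, for comparison
theta_ls = np.linalg.lstsq(X, y, rcond=None)[0]
```

The penalized solution has a strictly smaller norm than the unregularized one: the quadratic penalty shrinks all weights toward zero but, unlike L1, does not set any of them exactly to zero.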
Regression
Classification
Confusion Matrix
“Also known as an error matrix, is a specific table layout that allows visualization of the performance of an algorithm, typically a supervised learning one.”
True Negative
Negative cases that have been classified as negative.
True Positive
Positive cases that have been classified as positive.
False Positive
Negative cases that have been incorrectly classified as positive.
False Negative
Positive cases that have been incorrectly classified as negative.
Sensitivity or Recall
How good is my classifier at detecting positive cases? \[ \frac{TP}{TP+FN} \]
Specificity
How good is my classifier at avoiding negative cases? \[ \frac{TN}{TN+FP} \]
Precision
How credible is my classifier when it detects a positive case? \[\frac{TP}{TP+FP}\]
Accuracy and Balanced Accuracy
How many cases does the classifier correctly identify? \[Accuracy = \frac{TP+TN}{TP+FP+FN+TN}\] \[BalancedAccuracy = \frac{Specificity+Sensitivity}{2}\]
Prevalence
How often does the positive condition actually occur in our sample? \[\frac{TP+FN}{TP+FP+FN+TN}\]
Detection Rate
Percentage of true positives \[\frac{TP}{TP+FP+FN+TN}\]
Detection Prevalence
Percentage of positives \[\frac{TP+FP}{TP+FP+FN+TN}\]
F1 Score
Harmonic mean of recall and precision.
\[2\frac{\left( Precision \right) \left( Sensitivity \right)}{Precision+Sensitivity}\]
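The harmonic-mean formula above is a one-liner once precision and recall are in hand; the counts here are made up for illustration.

```python
# Sketch: F1 score as the harmonic mean of precision and recall.
TP, FP, FN = 40, 5, 10                  # illustrative counts
precision = TP / (TP + FP)              # 40 / 45
recall = TP / (TP + FN)                 # 40 / 50
f1 = 2 * precision * recall / (precision + recall)
```

Algebraically this reduces to \(2TP / (2TP + FP + FN)\), so the F1 score, unlike accuracy, ignores the true negatives entirely.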
For Class 1
For Class 2
For Class 3